Section 1 INTRODUCTION



This is the Final Report required under Contract No.
P19628-72-C-(11G3, an effort to develop an on-line aid for
naive users of the ARPA Computer Network. We have named
this aid NET-SCHOLAR, since it is a SCHOLAR-like system
[Carbonell, 1970; Carbonell and Collins, 1970, 1972]
oriented towards helping people learn how to use the ARPA NETwork
(ARPANET) [Heart et al., 1970; Crocker, 1972] and to make use
of the computer facilities available through it. The
original SCHOLAR system, oriented towards tutoring the
geography of South America, has been renamed GEO-SCHOLAR.

The most important difference between a SCHOLAR system 
designed for the tutoring of Geography and a SCHOLAR system 
aimed at helping users of a computer network is the latter's
necessity to deal with procedural and functional 
information. This, in turn, requires that the system be 
able to handle action verbs. 

Most of the questions that a student asks in the 
context of Geography can be asked comfortably without using 
action verbs, because such questions deal mostly with static 
information. But when the subject becomes how to use the 
ARPANET, such a restriction is no longer acceptable. It
becomes imperative to be able to handle questions such as
'How do I save a file in BBN-TENEX?' or 'How does one edit
text with TECO?' The work reported in this document is
therefore centered around the representation and handling of 
functional and procedural information, within the context of 
the ARPA network. 

What are the characteristics of this type of 
information? To gain some insight into this question, we 
conducted a number of tutoring sessions, involving a 
knowledgeable user of the ARPANET tutoring a naive (with 
respect to computer networks) computer user. The results 
are described in Section 2. The paramount importance of 
being able to handle action verbs, operations, purposes, and 
procedures became absolutely clear. Verbs were classified 
and a taxonomy of questions was constructed, providing the 
basis for the next phase of our work. 

In Section 3 we describe our method for encoding the
meaning of actions, operations, purposes, procedures, etc.,
which is based on the use of a suitably modified case
grammar, 'a la' Fillmore [Fillmore, 1968].

Section 4 describes a predecessor of NET-SCHOLAR. This
temporary system (TMPNET-SCHOLAR) is not able to comprehend
questions involving action verbs, but once a question is
paraphrased accordingly, it produces an answer that is
comparable, in its use of verbs, to that of the target
system. TMPNET-SCHOLAR came into being because we needed a
system with which to test and shake down the data base while
the English comprehension routines of NET-SCHOLAR were being
built.

In Section 5, we describe NET-SCHOLAR proper, with its 
English comprehension capabilities developed to the point of 
understanding simple questions (no subordinate clauses, for
example) involving action verbs. The description is 
illustrated with actual protocols. A Summary and 
Conclusions Section closes the Report. 

As a closing note, we would like to emphasize that 
NET-SCHOLAR is by no means a finished product, and that in 
our followup work we will continue to improve the capability 
of SCHOLAR systems to interact naturally with people. 
Section 2 TUTORING PROTOCOLS
2.1 Introduction 



In order to gain some insight into what kind of
information a naive user would need to know in order to make
intelligent use of the ARPANET, we decided to conduct some
actual tutoring sessions in which a user familiar with TENEX
but almost totally ignorant about computer networks was
tutored in an interactive manner by another user well versed
in the ARPANET.

The format of the sessions was left entirely free. We 
asked each of two tutors to teach two naive members of BBN's 
staff (naive in the sense that they had never used any 
computer network before) how to use the ARPANET so that they 
might be able to accomplish some reasonable goal, like being 
able to talk to several hosts or to copy files through the 
network. The tutor and the student sat in front of a 
teletype and the conversation was taped and transcribed. 

It is important to give an idea of what transpired 
during these tutorial sessions. To this end we present 
first a digest of the most important topics covered, how the 
tutor chose to present them and the questions that arose. 
The digest summarizes about 52 pages of transcript. Then we 
describe how we analyzed these transcripts and finally we 
present a classification of questions that users are likely 
to ask when they learn how to use the ARPANET.

2.2 Digest of one of the tutoring sessions 

General description of the ARPANET 
Topological aspects 
Sites and hosts (local and remote). 
Network Control Programs (NCP) and their function. 
TELNET, the NCP of TENEX. 

Operational description of TELNET. 

Tutor logs in to BBN's TENEX system and calls
TELNET.

LOCAL and REMOTE command modes.

Tutor states their existence and gives a general 
idea of what they do. 

Points out that LOCAL command mode is often called 
COMMAND mode. 

Command recognition and other features designed to 
facilitate use. 

Tutor shows use of abbreviations, help features,
and how TELNET refuses to accept illegal
characters.

CONNECTION.TO command

Tutor demonstrates by connecting to SRI-ARC, a
TENEX system located at Stanford Research
Institute. This system is the home of the Network
Information Center (NIC).

Logging in to a foreign host 

Tutor logs in using appropriate identification and
password. A mask is printed out to hide the
password instead of the expected echo.
More about this later. Calls the NLS subsystem to
demonstrate how to access information stored in
the NIC.

Question: How do you break the connection?
Tutor demonstrates: first logs out of foreign host
(student is reassured that this will not log out
of local host). Then types escape character (+Z)
and issues the disconnect command. Points out
that disconnecting without logging out would cause
the job to remain detached.

The question of echoes

Referring to the problem of masked passwords, the
tutor talks about FULL DUPLEX and HALF DUPLEX, and
about LOCAL ECHO MODE and REMOTE ECHO MODE.
Sets the echo mode to LOCAL.



The function of the escape character (+Z) 

i) Returns control to TELNET 

ii) Sets the command mode to COMMAND 

iii) Returns to REMOTE command mode upon
terminating a command. 



Talking to two hosts at the same time

The tutor proposes this as a further demonstration
of the use of the CONNECT command.

Questions: Did you disconnect from SRI yet? There
is a difference between logging out and
disconnecting, right? How many connections can you
have?

Tutor connects to RAND-SRC (the TENEX system at
RAND Corporation) and logs in. Then he returns to
TELNET (in BBN) and connects to BBN (through the
network). This is confusing to the student, so
the tutor describes commands that may help in
situations such as the present: LIST CONNECTIONS,
RETRIEVE CONNECTION (sets user in the desired
connection), and NET STATUS (it looks to see which
connections are going out of each socket).*
Question: What is a socket?

Description of sockets (socket numbers) 

Tutor explains what they are by analogy with jobs.
"Job numbers are just a way of talking about
(identifying) particular interactions. In the same
manner, socket numbers are the way hosts have of making
sure data is channeled right." Points out that when a
socket in a host is made to correspond with another
socket in another (or the same) host, a connection is
made. The subject is complex and there is evidence of
confusion in the student's mind.

Question: I still do not understand what you mean by 
socket. Why do you need four sockets for a connection? 
Why are socket numbers supposed to apply to job numbers 
or user numbers? 

The tutor proposes to clear up confusion by way of 
example — she is going to start two programs (each in a 
different host) that will talk to each other. This 
exercise proves to be useful but many doubts remain. 



*A new command (WHERE.AM.I) has been added recently to
TELNET to cope with situations such as the one
described.

Tutor and student agree on adjourning and reconvening
the following day.

2.3 Analysis of transcripts 

As a first pass in analyzing the transcripts, we
set out to investigate how important action verbs are
in dealing with this kind of subject matter, the extent
to which they are used, and in what context. By action
verbs we mean all verbs except the verb have and the
linking verb be (no other linking verb appeared in the
transcripts).

We began by extracting all the action verb phrases
that appeared in one of the transcripts, and we
catalogued them as follows: for each new verb phrase
that we encountered we made an entry on a card headed
by the main constituent of the verb phrase, with a
classification of its type and a back pointer to where
it occurred in the transcript. For example, the first
time the verb phrase "have to remember" occurred, we
entered it on a card headed "remember", and classified
it as an 'aux-Vrb, Pres' verb phrase, meaning that its
Tense is present and that the phrase is constructed
with 'have to' as an Auxiliary (or Modifier).
Thereon, all repetitions of this same verb phrase were
ignored.
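
The cataloguing procedure just described can be pictured as a small index. The following is a minimal sketch in Python, offered only as an illustration: the cards were handled by hand, and the names `Entry` and `catalogue`, the sense labels, the transcript positions, and every phrase type other than 'aux-Vrb, Pres' are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    phrase_type: str   # classification of the phrase, e.g. 'aux-Vrb, Pres'
    back_pointer: int  # where the phrase first occurred (hypothetical position)


def catalogue(occurrences):
    """Build cards from (phrase, head, sense, phrase_type, position) tuples.

    A card is identified by (head, sense), so one spelling of a verb
    may head several cards if it is used with several meanings.
    """
    cards = {}
    for phrase, head, sense, phrase_type, position in occurrences:
        card = cards.setdefault((head, sense), {})
        if phrase not in card:  # repetitions of a verb phrase are ignored
            card[phrase] = Entry(phrase_type, position)
    return cards


# Hypothetical occurrences; only "have to remember" and its type come from the text.
SAMPLE = [
    ("have to remember", "remember", "recall", "aux-Vrb, Pres", 12),
    ("call", "call", "summon", "Vrb, Imper", 20),
    ("have to remember", "remember", "recall", "aux-Vrb, Pres", 41),
    ("is called", "call", "name", "Vrb, Pres", 57),
]
CARDS = catalogue(SAMPLE)
```

In this sketch the number of cards corresponds to the count of main constituents, and the total number of entries corresponds to the count of different verb phrases.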

In this manner we catalogued approximately 280
different verb phrases of 148 different main
constituents (action verbs). There are at least two
reasons for this relatively high number of verbs.
Firstly, verbs were indexed according to their meaning,
and not according to their spelling. For example, we
had 2 different cards headed by the verb call: to
call (summon) a program (as in "Call TELNET"), and to
call (name) a file, a connection, or a host (as in
"The host of Stanford Research Institute is called
SRI-ARC"). We also had 5 cards headed by the verb go, which
convey the same meaning as disappear ("it is gone"),
traverse ("bytes going across the network"), explain
("we could go into why this is so"), exit or leave
("we must go out"), and undergo ("The NCP's go through
a hand-shake procedure").

A second reason for the high number of different
verbs encountered was the presence of synonyms or
quasi-synonyms such as send and transmit; type in,
give, issue, say, tell (a command to the computer
system); start, begin, initiate; stop, finish, end,
terminate, complete; etc.

All this prompted us to attempt building a
taxonomy of the verbs used in the transcript, to see if
some kind of structure or hierarchy could be uncovered.
To do this in a manner as free as possible, we just
laid out the cards on a large table and set out to
group them into clusters and then the clusters into
superclusters, etc. At the beginning our criterion for
grouping was completely unformulated; they were just
grouped, somehow. After several trials some criteria
began to emerge: there appeared to be physical action
verbs (denoting operations, functions and commands)
and mental action verbs (addressing mental actions,
intellectual functions, and states of mind). Ignoring
the overlaps between these two main categories, let us
designate the first as action verbs and the second as
mind verbs. Within each of these main categories there
were several levels of generality, in such a way that a
verb in a high level somehow conveyed to us part of the
meaning of several verbs in the level below. A
frequently occurring feature was the presence of
opposite pairs, like save vs. retrieve, start vs.
stop, allow vs. limit, come vs. go, remember vs.
forget, etc.
Without attempting to justify the procedure on any
other grounds than those alluded to above, we present
below a classification of the verbs extracted from the
transcript. We represent levels by degree of
indentation, so that verbs with greater specificity of
meaning (lesser generality) are indented more than
verbs with less specific meaning. In order to save
vertical space, we have strung out horizontally verbs
that should be considered at the same level; the
criterion is that all verbs on a line have the degree
of generality represented by the indentation of the
first verb in the line. We have also enclosed in
parentheses synonyms, quasi-synonyms, and antonyms.
Where appropriate, comments have been introduced to
explain the criterion followed for that particular
class.
Classification of action verbs

Verbs that address the action or are used in
circumlocutions about the action
(DO, PERFORM, EXECUTE)

Verbs that create new wholes using parts
MAKE, (COMPOSE vs. SEPARATE)

Intent modifiers

HELP, TRY, UNDERGO, ORDER, COMMAND, USE

Affecting the state or course of the action
(HAPPEN, TAKE PLACE)

((START, BEGIN, INITIALIZE) vs. (STOP, FINISH,
END, COMPLETE))

((INTERRUPT, BREAK) vs. CONTINUE)
(WAIT, TAKE TIME)
((ALLOW, LET) vs. LIMIT)



Specific action verbs 
MOVE 

((COME, ARRIVE) vs. (GO, LEAVE)), RETURN 
PUT, LEAD VS. FOLLOW 

Alluding to possession 

(GIVE vs. TAKE), (RID vs. HOLD), KEEP, SET 

Transmission of information 
COMMUNICATE, INTERACT 

((ASK, (ASK FOR, REQUIRE)), (GET, OBTAIN)) 

TALK ABOUT 

((CALL, TALK TO, CONNECT) VS. DISCONNECT) 
Applied to commands 

((TYPE, TYPE IN) vs. TYPE OUT), PROVIDE,
(ISSUE, GIVE, SAY, TELL)

Commands and their imperatives
(SAVE vs. RETRIEVE), (CALL IN, SUMMON)
((COPY, TRANSFER), ((SEND, TRANSMIT)
vs. RECEIVE))
(LIST, PRINT OUT), RUN

((LOGON, LOG IN) vs. (LOG OUT, GET
OUT, EXIT))

SPECIFY, (CALL, NAME), ECHO
PERCEIVE

(SEE, LOOK), (HEAR, LISTEN TO) 

Classification of Mind Verbs 

THINK, REALIZE, MEAN
(LEARN, KNOW)
(REMEMBER vs. FORGET)

DECIDE, (SHOW, DEMONSTRATE), RECOGNIZE
ESTABLISH

(ASSUME, SUPPOSE), (GUESS, MAKE UP)
ACCEPT, (THINK, BELIEVE)
((CONFUSE, (CONCERN, WORRY)) vs. (EXPLAIN,
UNDERSTAND))

REFER, (CORRESPOND, PERTAIN) 
COMPARE 

RESEMBLE, DISTINGUISH, EQUATE 



This initial classification helped us identify an
important class of representations for the encoding of verbs
in a semantic network, as we shall see in Section 3.
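
The notation used in the classification (parenthesized synonym groups joined by "vs." to their opposites) maps naturally onto a small lookup structure. The following Python sketch is only an illustration of that notation, not the representation used in the data base; it encodes three of the pairs listed above, and the names `PAIRS`, `synonyms` and `opposites` are hypothetical.

```python
# Each entry is (synonym group, opposite synonym group), copied from the lists above.
PAIRS = [
    (("START", "BEGIN", "INITIALIZE"), ("STOP", "FINISH", "END", "COMPLETE")),
    (("INTERRUPT", "BREAK"), ("CONTINUE",)),
    (("SAVE",), ("RETRIEVE",)),
]


def _lookup(verb):
    """Return (own group, opposite group) for a verb, or ((), ()) if absent."""
    for left, right in PAIRS:
        if verb in left:
            return left, right
        if verb in right:
            return right, left
    return (), ()


def synonyms(verb):
    """Other verbs in the same parenthesized group as `verb`."""
    own, _ = _lookup(verb)
    return [v for v in own if v != verb]


def opposites(verb):
    """Verbs on the other side of the 'vs.' from `verb`."""
    _, other = _lookup(verb)
    return list(other)
```

For instance, `synonyms("BEGIN")` yields the other members of the START group, and `opposites("RETRIEVE")` yields SAVE.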
2.4 The question of questions



In order for NET-SCHOLAR to be a valuable Question
Answering System, it must be able to answer most of the
questions posed in real life by most naive users of the
ARPANET. Since the set of questions asked during the
tutorial sessions provides a starting basis for the kind of
information the system will have to process and interpret,
we set out to extract and classify them. The following list
of questions was put together by abstracting representative
ones from the transcripts and by adding a few of our own.

LIST OF QUESTIONS 



1. Is <host name> up? 

2. Can I transfer files through the network? 

3. How do I interrupt (break) a command in TELNET? 

4. What is the command to insert a string using TECO? 

5. How do I log in to a host using the network? 

6. What is the meaning of "SETTINGS LOADED" in TELNET?

7. What is the purpose of the command ;Y FILENAME in TECO?

8. How do I read in a file for editing in TECO? 

9. How do I edit at BBN? 

10. Where am I now? 
In terms of pragmatics, these questions can be
classified in the following four types:

1. State of... question
Examples:

1a. Is SRI up?

1b. Where am I now?

2. Means or methods for performing an action
Prototype:

{do (noun of action) | action verb} - object - prep phrase

Examples:

2a. Does ;Y read in files in TECO?

2b. Can I transfer files over the network?

2c. How does one interrupt a command in TELNET?

2d. How do I edit at BBN?
3. Addressing unnamed actions or methods
Prototype:

What is {command | method} - {object-prep phrase | state of...}

Examples:

3a. What is the command for inserting a string in
TECO?

3b. How do I get out of remote mode in TELNET?

3c. How do I find out if SRI is up?



4. Purpose of...
Prototype:

{use of | meaning of | purpose of} - object - prep phrase

Examples:

4a. What is the meaning of the comment "SETTINGS
LOADED" in TELNET?

4b. What is the purpose of the command ;Y
FILENAME in TECO?



Undoubtedly, there are many more types of questions 
that we have not included (some of them deliberately) in 
this preliminary classification. However, the object of 
this part of our work was to provide us with a framework, an 
initial basis from which to start and proceed to build a 
data base with the features that would enable NET-SCHOLAR to 
deal with those types of questions. 
Section 3 FUNCTIONAL AND PROCEDURAL REPRESENTATIONS



3.1 Introduction 

Most of the questions that students ask about the geography
of South America can be asked comfortably without resorting
to action verbs. This is because such questions deal mostly
with static information, for which the use of action verbs
is seldom necessary. For this type of questioning,
descriptive representations of knowledge about static
reality, such as are used in the data base of GEO-SCHOLAR,
are adequate.

The evidence presented in Section 2, however, points
out the need for a capability to deal with functional and
procedural information, in addition to static information,
when the subject shifts from geography to how to use the
ARPANET. This, in turn, requires a system capable of
handling questions involving action verbs.

Talking about computer memories, for example, we may
say that they are hardware devices, that they are a part of
every computer, that they may have certain attributes (such
as size, word length, access time, data transfer rate,
etc.), that cores, drums, disks, etc. are some examples of
memories, etc. This type of knowledge can be represented in
a data base 'a la' GEO-SCHOLAR. But the facts, for example,
that memories store data, or that computer users save files
on disks, would not be representable in such a data base. We
need, therefore, to have a capability for representing
functional and procedural knowledge* (what memories are used
for and how they are used for that purpose), which in turn
requires a capability for representing the meaning of action
verbs.

In what follows we begin with a brief review of the
classical grammar of action verbs, followed by a description
of case grammar [Fillmore, 1968]. This is necessary since
Fillmore's ideas provided the theoretical framework on which
our representation of verb meanings is based. Then we
describe how we actually implemented these ideas and how we
incorporated case-structure descriptions in our Semantic
Network.

3.2 Action verbs

A commonly accepted way of defining verbs is that they
are words used to assert or express action, state or
condition, or being.

*A parallel study [Collins, et al, 1973] addresses the
problem of how to teach procedural knowledge.



Verbs that express condition or being, such as be,
seem, grow, feel, become, appear, are also called linking
verbs because they merely link the subject with a predicate
complement; these verbs subordinate their meaning to the
function of connecting the predicate idea with the subject.
One of these verbs, be, and the verb have in its possessive
and relational meanings, are the only verbs needed by
SCHOLAR in the geography application. Therefore, we can say
that handling verbs is, generally speaking, an entirely new
problem for SCHOLAR-like systems.

3.2.1 Syntactic descriptions 

Let us begin by reviewing the fundamentals of classical
or traditional grammar that are applicable to verbs. The
verb relates a subject (a being, force or instrument that
instigates or causes the action or state identified by the
verb) to a direct object that receives or is the consequence
of such action. Sometimes the meaning of a verb is clear
without a direct object. An example is:

s1 The computer crashed.

But most times a direct object is needed; in general,
action verbs are associated with direct objects, and the
verbs are called transitive.

s2 FTP transfers files through the network

Some action verbs (give, bring, show, ask, tell, etc.) are
often followed by a second object (the indirect object) to
name the person or thing for whom the action is performed.

s3 Show me a list of TELNET commands 

We said before that verbs were words used to assert or
express action, condition or state, or being. Actually,
instead of a single word, we may have verb phrases, or
phrasal verbs, denoting different aspects, moods, and time of
occurrence of the action (e.g., should have been told).

Verb phrases can be classified in eight possible forms that
can be conveniently represented as follows [Chomsky 1965,
page 43]:

Verb Phrase → Aux + Verb

Aux → Tense (Modal) (Perfect) (Progressive)
The meaning of the notation is that Tense is always present,
while Modality and Perfect or Progressive aspects may or may 
not be present. Time of occurrence of the action (effected 
in the past, future, extending into the future, etc.) as 
well as its possibility, its optional or doubtful character, 
its necessity, etc. can be very precisely encoded in this 
manner. 
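
The count of eight forms follows from the rule itself: Tense is obligatory and each of the three parenthesized constituents is independently present or absent, giving 2 x 2 x 2 = 8 combinations. A minimal Python sketch that enumerates them is given below; the function name `aux_forms` is a hypothetical choice made for illustration.

```python
from itertools import product

# The three optional constituents of Aux, in the order given by the rule.
OPTIONAL = ("Modal", "Perfect", "Progressive")


def aux_forms():
    """Enumerate every expansion of Aux -> Tense (Modal) (Perfect) (Progressive)."""
    forms = []
    for flags in product((False, True), repeat=len(OPTIONAL)):
        chosen = [name for name, present in zip(OPTIONAL, flags) if present]
        forms.append(("Tense", *chosen))  # Tense is always present
    return forms


FORMS = aux_forms()
```

The shortest form in `FORMS` is Tense alone, and the longest contains all four constituents.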

Certain verb forms are not verbs at all. Active and
passive participles and participial phrases are often used
as adjectives, gerunds are used as nouns, and infinitives
can be used as nouns, adjectives or adverbs. Examples of
active and passive participles used as adjectives are:

the rotating disk 

the connecting link 

the stored data 

the used time 

When participial phrases are used as adjectives, they can 
have modifiers and objects of their own. 

s4 Linking computers throughout the continent, the
ARPANET serves a valuable function.

Gerunds are -ing forms of verbs that are used as nouns. In
s5, the object is a gerund phrase.

s5 Try connecting to SRI-ARC



When infinitives are used as nouns, they can often be
interchanged with gerunds.

s6 Try to connect to SRI-ARC

When used as adjectives or adverbs, infinitives can usually
be interchanged with "for V-ing".

s7 Devices are used to transfer data
   (to transfer data: adjective phrase, modifies devices)

s7a Devices are used for transferring data

The "to" may be elliptic after such verbs as help or
make. Certain properties, features and additives that
characterize the behavior of certain verbs must be taken
into account in the syntactic descriptions. What follows is
a "potpourri" of such facts. As such, it is not intended to
be exhaustive but only illustrative of the complexity of the
problem and of the wide range of idiosyncrasies exhibited by
common verbs.
Certain transitive verbs may or may not lend themselves
to intransitiveness (i.e., object deletion). While read
applies to a narrow range of objects and can be employed
intransitively, keep applies to a much wider range of
objects and cannot be used intransitively.

Certain verbs, such as own, denote an action that tends
to prolong itself indefinitely in time. Since the
progressive aspect denotes precisely that (i.e., continuing
action), it is not surprising that own does not
lend itself to the progressive form; indeed such a form
would be redundant.

Verbs can be classified according to the class of
subjects and objects that they will take, such as
abstractness of subjects or animateness of objects. For
example, only time-intervals elapse, and only physical
objects move (not programs, sincerity, or chess). Other
verb categorizations can be made according to a) whether or
not they take a "like" predicate, such as in

s9 The Tenex Monitor handles the EXEC like an ordinary
user program,

or b) the type of prepositional phrases they
take, such as at, into, after, for, up following the verb
look. In this case prepositions carry meaning; i.e., look
at (direct one's gaze), look for (seek), look into
(examine), etc.

3.2.2 Fillmore's case grammar 

One way of encompassing many of the facts about verbs
we just finished describing has been proposed by Fillmore
[Fillmore, 1968]. Since we have adopted many of his ideas
in the construction of our data base, a brief summary is
in order.

Fillmore takes the view that a simple sentence is 
constituted by a verb plus several noun phrases. Each noun 
phrase is associated with the verb in a particular case 
relationship, and each case relationship occurs only once 
(except where conjunction occurs). 

The array of cases in any given simple sentence typifies the
sentence, and verbs can be classified according to the
sentence types into which they may be plugged.

The case relationships are: 
1) Agentive (A). The instigator (typically animate,
but not necessarily so) of the action identified by the verb.

2) Instrumental (I). The inanimate force or object
that is the cause of the action.

3) Dative (D). The animate being affected by the
action.

4) Factitive (F). The object or being resulting from
the action of the verb.

5) Locative (L). Where the action happened, and in
what orientation.

6) Objective (O, S). The thing that is affected by the
action. It may be a subordinated sentence (S).

This is a non-exhaustive list: Time and Benefactive
cases appear to be necessary additions, and many others have
also been proposed.

Subjects, objects, and prepositional phrases are formed
by rules patterned after sentence types. For example, in an
A, O active sentence, A is the subject and O is the object;
in an I, O active sentence, I is the subject; in an A, I, O
active sentence, A is the subject, O the object, and I is
appended in a prepositional phrase beginning with "with".
If neither A nor I is present, the O is the subject.
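
The subject-selection rules for active sentences can be written down as a short function. This is a sketch under stated assumptions rather than Fillmore's own formulation or the NET-SCHOLAR code: the name `surface_roles` and the dictionary layout are hypothetical, and treating O as the object of an I, O sentence is our reading of the rule, which names only the subject.

```python
def surface_roles(cases):
    """Map a case assignment to surface roles of an active sentence.

    `cases` maps case letters ('A', 'I', 'O') to noun phrases.
    """
    roles = {}
    if "A" in cases:
        roles["subject"] = cases["A"]
        if "O" in cases:
            roles["object"] = cases["O"]
        if "I" in cases:
            # With an Agentive present, the Instrumental goes into a "with" phrase.
            roles["prep_phrase"] = "with " + cases["I"]
    elif "I" in cases:
        roles["subject"] = cases["I"]
        if "O" in cases:
            roles["object"] = cases["O"]  # assumption: O stays the object
    elif "O" in cases:
        # Neither A nor I is present: the Objective becomes the subject.
        roles["subject"] = cases["O"]
    return roles
```

For example, the assignment A = users, O = text, I = TECO yields the subject "users", the object "text", and the phrase "with TECO".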


Verbs and nouns are plugged into the case frame (array
of cases) of a sentence according to their lexical features.
Thus, for example, animate nouns can be A or D, but
abstract nouns cannot be L. Lexical features for verbs are
represented in a manner similar to sentence types. For
example, the sentence

s10 Users edit text with TECO

has a case frame [A, O, I], and for the verb edit the list
[(A), (O), (I)] represents that case frame. The parentheses
denote options. Other examples are open [O, (L), (A)] with
an obligatory O, and logout [D, (L), (A)].
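
A case frame with optional members can be checked mechanically: a sentence fits a verb's frame when it supplies every obligatory case and no case outside the frame. The Python sketch below illustrates this with the three frames quoted above; the names `FRAMES` and `fits` and the set-based layout are hypothetical, chosen only for illustration.

```python
# Frames from the text: parenthesized cases are optional, the rest obligatory.
FRAMES = {
    "edit":   {"required": set(),  "optional": {"A", "O", "I"}},  # [(A), (O), (I)]
    "open":   {"required": {"O"},  "optional": {"L", "A"}},       # [O, (L), (A)]
    "logout": {"required": {"D"},  "optional": {"L", "A"}},       # [D, (L), (A)]
}


def fits(verb, cases):
    """True if the set of cases in a sentence is accepted by the verb's frame."""
    frame = FRAMES[verb]
    present = set(cases)
    allowed = frame["required"] | frame["optional"]
    return frame["required"] <= present and present <= allowed
```

Under this sketch, sentence s10 (cases A, O, I) fits edit, while a sentence with only an Agentive does not fit open, because the obligatory O is missing.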

In addition, verbs are classified:

a) by idiosyncratic choices of particular nouns as
subjects or objects, overruling what is determined by the
general rules,

b) by the choice of preposition associated with each
case relationship, and

c) by the choice of complementizer (that, -ing, for-to),
for verbs taking the S case.

As examples of a) above, the verbs belong, want, and think
are accepted by the frame [O, D], but belong takes O as
subject while want and think take D as subject.

The prepositions generally associated with each case
are as follows:

A -- by

I -- by if no A, else with

O, F -- none

D -- to

L -- on, at, in, etc., depending on noun and verb.

Preposition deletion (and sometimes substitution) is a
frequent grammatical transformation (e.g., deletion of by
when the agentive is in the subject position).
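
The preposition table can be read as a function of the case and of the other cases present in the sentence. The following Python sketch encodes the table before any deletion or substitution transformation is applied; the name `preposition` and the `locative` parameter (standing in for the noun- and verb-dependent choice among on, at, in, etc.) are hypothetical.

```python
def preposition(case, cases_present, locative="at"):
    """Return the preposition generally associated with `case`, or None.

    `cases_present` is the set of cases in the sentence; it matters only
    for the Instrumental, which takes "by" when there is no Agentive.
    """
    if case == "A":
        return "by"
    if case == "I":
        return "with" if "A" in cases_present else "by"
    if case in ("O", "F"):
        return None  # no preposition
    if case == "D":
        return "to"
    if case == "L":
        return locative  # assumption: caller supplies on/at/in as appropriate
    raise ValueError("unknown case: " + repr(case))
```

Preposition deletion, such as dropping "by" when the agentive is the subject, would be a separate step applied after this lookup.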

3.3 Semantic networks for action verbs

Let us now see how the actual representation of actions
and purposes may be implemented in the data base. The
reader is assumed to be familiar with the structure and the
format of the data base for GEO-SCHOLAR [Carbonell, 1970;
Carbonell and Collins, 1970, 1972].
We begin with representations of action verbs. At
least 5 kinds of information appear to be
necessary: syntactical, case-structural, procedural, and
ontological information, and information about conditions
and requirements. These categories apply to verbs in
general, but not necessarily to each particular verb. Let
us examine them.

1) SYNTAX Information

The heading (the unit's name) is the infinitive form of
the verb, and is followed by a set of descriptors. The
first descriptor is VRB, followed by synonyms or other verbs
of closely similar meaning. This is the only syntactic
descriptor incorporated in the version of NET-SCHOLAR to be
described in Section 5. In future SCHOLAR systems of this
type we may incorporate more descriptors. A second
descriptor could be a list of the conjugation irregularities
the verb may exhibit, if any.

Example (1st descriptor, then 2nd descriptor):

WRITE ((VRB WRITE) (CONJIRR WROTE WRITTEN) ...)

A third descriptor could be a list of case frames, 'a la'
Fillmore, of the sentence types into which the verb can be
plugged.

Example:

WRITE ((...) (...) (OBJ AGENT (OPTION LOC)))

meaning that OBJ and AGENT (Objective and Agentive) are
required case-relationships, while the locative case
relationship (LOC) may or may not be specified.

An important observation to be made at this point is
the restriction in meaning of the verb write that is
implicit in the choice of this descriptor. We are
concerned, in the context of tutoring naive users of the
ARPANET, neither with producing novels or plays as writers
do, which would make OBJ optional, nor with the process of
forming letters or symbols, which would require the
Instrument case; we restrict the meaning of the verb to the
setting down of information so as to make it possible to
read, as when writing programs or writing files on magnetic
tapes.

2) CASE Information 

By this we mean a characterization of the noun phrases 
that, with the given verb, form sentences that are both 
meaningful and relevant. We do this by listing the nouns or 
the most general class of nouns that 'agree' with the given 
verb under the various case relationships. 
Examples:

Under DELETE

(OBJ NIL FILE JOB DATA)

(AGENT NIL USER)

(INSTR NIL PROGRAM COMMAND COMPUTER/SYSTEM)

Observe that instead of listing CHARACTER, STRING, WORD,
etc., that 'agree' with DELETE in the objective case
relationship, we have used their SUPERC, namely DATA.
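
Listing a SUPERC instead of each noun implies that agreement is decided by climbing the superconcept chain. A minimal Python sketch of such a check follows; it is an illustration under assumptions, not the SCHOLAR routine. The names `SUPERC`, `DELETE_CASES` and `agrees` are hypothetical, and only the three SUPERC links mentioned in the text are encoded.

```python
# Superconcept links taken from the text: CHARACTER, STRING and WORD are DATA.
SUPERC = {"CHARACTER": "DATA", "STRING": "DATA", "WORD": "DATA"}

# CASE information listed under DELETE.
DELETE_CASES = {
    "OBJ": {"FILE", "JOB", "DATA"},
    "AGENT": {"USER"},
    "INSTR": {"PROGRAM", "COMMAND", "COMPUTER/SYSTEM"},
}


def agrees(case_info, case, noun, superc=SUPERC):
    """True if `noun`, or one of its superconcepts, is listed under `case`."""
    allowed = case_info.get(case, set())
    seen = set()
    while noun is not None and noun not in seen:
        if noun in allowed:
            return True
        seen.add(noun)            # guard against cycles in the links
        noun = superc.get(noun)   # climb one SUPERC link, None at the top
    return False
```

With these tables, STRING agrees with DELETE in the objective case through its SUPERC, DATA, but does not agree in the agentive case.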

3) Conditions and Requirements 

Sometimes, we want to represent certain conditions 
and/or requirements that must prevail so that the action 
identified by a verb may take place. 

For example, the computer system may recognize a 
command after the user has typed in the first few characters 
of the command, i.e.: the computer prints the rest of the 
command. In order for this to take place, the user must 
have typed in enough characters so that the command can be 
uniquely identified. We represent this type of information 
as follows:

Under RESPONSE (Under RECOGNITION MODE)

(PRINT NIL (AGENT NIL COMPUTER/SYSTEM)
           (OBJ NIL (COMMAND NIL WHOLE))
           (CONDITION NIL
              (TYPE NIL (AGENT NIL USER)
                        (OBJ NIL
                           (CHARACTERS NIL SUFFICIENT)))))

4) Procedural information

When we say

"The computer system recognizes the intended command
for the user by means of the recognition mode,"

we specify an instrument (recognition mode) for the action
of the computer recognizing a command for the benefit of the
user. However, naming the instrument may not be enough for
a user to go ahead and use the information effectively. He
may need some "how to", or procedural, information to amplify
the instrumental information. Such procedural information
can be paraphrased (continuing our example),

"by the user's typing the initial string of the
command, followed by altmode*"

*Typing the Altmode character activates the recognition
mode of TENEX.

We represent such information as follows:



Under PURPOSE (Under RECOGNITION MODE)

(RECOGNIZE NIL
   (AGENT NIL COMPUTER/SYSTEM)
   (BENEF NIL USER)
   (INSTR NIL RECOGNITION/MODE)
   (OBJ NIL (COMMAND NIL INTENDED))
   (PROCEDURE NIL
      (TYPE NIL
         (AGENT NIL USER)
         (INSTR NIL RECOGNITION/MODE)
         (OBJ NIL ($SEQ
                     (STRING NIL INITIAL
                        (OF NIL COMMAND))
                     ALTMODE/CHARACTER)))))

The attribute $SEQ specifies that the actions listed under 
it must be performed successfully and successively, in the 
given order. 
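
The $SEQ rule just stated can be sketched as follows. This is
only an illustration in Python, not the LISP routines of the
report: the nested lists mirror the two items under $SEQ in the
RECOGNIZE example (with NIL written as None), and run_seq,
label, and the perform callback are hypothetical names
introduced here.

```python
def run_seq(steps, perform):
    """Perform steps in the given order, stopping at the first failure.

    Returns (succeeded, steps_completed).
    """
    completed = []
    for step in steps:
        if not perform(step):
            return False, completed
        completed.append(step)
    return True, completed


def label(item):
    """Name of a step: an atom itself, or the head of a nested list."""
    return item if isinstance(item, str) else item[0]


# The two items under $SEQ in the RECOGNIZE example, with NIL as None.
seq = [["STRING", None, "INITIAL", ["OF", None, "COMMAND"]],
       "ALTMODE/CHARACTER"]

ok, done = run_seq(seq, lambda item: True)  # every step succeeds
print(ok, [label(s) for s in done])

# Simulate a user who never types the altmode character.
ok, done = run_seq(seq, lambda item: label(item) != "ALTMODE/CHARACTER")
print(ok, [label(s) for s in done])
```

The second call stops after the initial string, so the sequence
as a whole is reported as not accomplished.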

5) Other Ontological Information

The key component of this part of the representation is
a more general verb of which the given verb is an instance,
embedded in the standard SUPERC format of SCHOLAR. At a
lower level of relevancy there is a definition of the
action.

Examples:

Under WRITE

(SUPERC NIL COPY (STORE (I 3) (ON NIL MEMORY PERIPHERAL)))

Under INSERT

(SUPERC NIL EDIT (PUT (I 3) WITHIN))

Meaning can also be carried by verbs conveying the opposite
action and by nouns or adjectives closely related to the
verb.

3.4 Semantic networks for functions and purposes 

In addition to the descriptive information on the 
various components of the ARPANET and its hosts, which is 
handled in a manner very similar to the one in the geography 
data base, there are items describing what the unit does, or 
what it is used for, or what are its purposes. This 
information is stored under a special attribute that we 
shall call PURPOSE, and is represented by a list of action 
verbs, each followed by sublists of cases and 
characterizations of noun phrases. Within each PURPOSE 
there may be a PROCEDURE that explains how the indicated
purpose, function, or operation can be accomplished.

Example:

Under TECO (the most commonly used editing language in
TENEX)

(PURPOSE NIL (EDIT NIL
                (AGENT NIL USER)
                (OBJ NIL SYMBOLIC/DATA)
                (INSTR NIL TECO)
                (PROCEDURE (I 2)
                   (($L DELETE INSERT) NIL
                      (AGENT NIL USER)
                      (OBJ NIL STRING)
                      (INSTR NIL TECO/COMMAND]

The information encoded above could be paraphrased in 
several ways. From the information directly under PURPOSE 
we could derive: 

s11 TECO edits symbolic data

s12 Users edit symbolic data with (or using or by means
of) TECO

The information under PROCEDURE could be paraphrased as
follows:

s13 To edit, TECO commands delete and insert strings.

s14 To edit, users delete and insert strings with (or
using or by means of) TECO commands.

These sentences are constructed in the same manner as s11
and s12; the only difference is the affixation of the
explicative "to edit,".

In Section 5, we shall present more examples of
encodings of the general types we have described.

Section 4 BUILDING THE DATA BASE

In the previous Sections we demonstrated the importance
of being able to handle action verbs in order to deal
effectively with procedural information and we proposed a
method for representing the meaning of action verbs. This
entailed the design and implementation of an essentially new
SCHOLAR system, having the ability to comprehend questions
formulated with action verbs, and the building of a data
base of information about the ARPANET. The system that
emerged from this work is called NET-SCHOLAR and is
described in Section 5.

Building a data base covering significant aspects of
the ARPANET is in itself a large, time-consuming
undertaking. Not only must a large amount of information be
gathered, digested, condensed, and represented within the
structural constraints of semantic networks, but the
interdependency of the units of information that constitute
the data base makes the task of building it a highly
interrelated one. It was therefore essential to have a
system of routines with which to exercise the data as the
semantic network was built, without waiting for the
NET-SCHOLAR routines to be available. The solution that we
adopted was to develop a temporary, provisional system that
we called TMPNET-SCHOLAR.

TMPNET-SCHOLAR consists of slightly modified GEO-SCHOLAR
routines and a data base that is easily transformable into a
NET-SCHOLAR data base.

Consider, for example, the representation of

"The CTRL-A command deletes the last character in the
input string."

In the NET-SCHOLAR data base, such information is
represented as follows:

Under CTRL-A/COMMAND

(PURPOSE NIL
   (DELETE NIL
      (INSTR NIL CTRL-A/COMMAND)
      (OBJ NIL (CHARACTER NIL LAST))
      (LOC NIL (IN NIL INPUT/STRING))))

while in the TMPNET-SCHOLAR data base it appears as

(PURPOSE NIL
   (DELETE NIL
      (CHARACTER NIL LAST)
      (IN NIL INPUT/STRING)))

TMPNET-SCHOLAR cannot handle action verbs on input but
gives the appearance of such ability in output. One cannot
ask "What does CTRL-A do?" or "What does CTRL-A delete?", but
one can ask "What is the purpose of CTRL-A?" and get back the
answer "The purpose of CTRL-A is to delete the last
character in the input string". TMPNET-SCHOLAR cannot even
begin to compare, in usefulness and in freedom of
expression, with what is achievable with NET-SCHOLAR. But
with its help we were able to shake down a fairly sizable
amount of information and construct a data base that, with
relatively moderate effort, was converted to the NET-SCHOLAR
format once the routines had been implemented.

The following protocol, obtained on-line with
TMPNET-SCHOLAR, gives an idea of its capabilities and
limitations.

*WHAT IS THE ARPANET?*

ARPANET IS A COMPUTER NETWORK.

*TELL ME ABOUT COMPUTER NETWORKS*

COMPUTER NETWORKS ARE A NETWORK OF THE
INTERCONNECTED COMPUTER SYSTEMS.
THE STRUCTURE IS GRAPH LIKE.
THE NODES ARE THE COMPUTER SYSTEMS.
THE INTERMEDIATE NODES ARE THE MESSAGE PROCESSORS.
THE TERMINAL NODES ARE THE HOSTS.
THE LINK IS THE DATA LINES.

*WHAT ARE THE HOSTS OF THE ARPANET?*

THE MAIN HOSTS ARE BBN-TENEX, SRI-ARC,
MULTICS, RAND-RCC, AND UTAH-10.

*TELL ME SOMETHING ABOUT BBN-TENEX.*

BBN-TENEX IS A COMPUTER SYSTEM AND A HOST.
THE COMPUTER IS THE MODIFIED PDP-10.
THE CORE OF THE MEMORY IS PAGED.
THE OPERATING SYSTEM IS TENEX.
THE MAIN SUBSYSTEM IS THE EXEC.
THE NETWORK CONTROL PROGRAM IS TELNET.



*WHAT IS THE PURPOSE OF THE EXEC?*

THE PURPOSE OF THE EXEC IS TO ENABLE THE USER TO
INTERACT WITH TENEX.

*TELL ME SOMETHING ABOUT THE COMMANDS OF THE EXEC*

THE KINDS OF THE COMMANDS OF THE EXEC ARE EDITING,
ACCESSING, FILE HANDLING, AND PROGRAM CONTROLLING.

*WHAT IS AN EDITING COMMAND?*

(JUST A MINUTE ...) THE EDITING COMMAND IS TO EDIT.
THE PURPOSE IS TO CHANGE TEXT.

*WHAT ARE THE EDITING COMMANDS OF THE EXEC?*

THE EDITING COMMANDS OF THE EXEC ARE THE CTRL-A
COMMAND, THE CTRL-V COMMAND, THE CTRL-R COMMAND,
AND THE CTRL-X COMMAND.

*WHAT IS THE PURPOSE OF THE CTRL-A COMMAND?*

THE PURPOSE OF THE CTRL-A COMMAND IS TO
DELETE THE LAST CHARACTER IN THE INPUT STRING.

*WHAT IS THE PROCEDURE?*

THE PROCEDURE IS TO TYPE "CTRL" AND "A" AND
SIMULTANEOUSLY.

*WHAT IS THE RESPONSE?*

THE RESPONSE IS TO PRINT "\" FOLLOWED BY THE DELETED
CHARACTER.



*TELL ME MORE ABOUT THE COMMANDS OF THE EXEC.*

THE COMPONENTS OF THE COMMANDS OF THE EXEC ARE
THE KEYWORD, THE ARGUMENTS, THE NOISE WORDS, AND
THE SUBCOMMANDS.
THE INPUT MODES ARE THE LONG INPUT MODE, THE
SHORT INPUT MODE, AND THE RECOGNITION MODE.

*TELL ME ABOUT THE NOISE WORDS*

THE NOISE WORDS ARE INSERTED FOR CLARITY.
THEY ARE ENCLOSED WITHIN PARENTHESES.
THEY ARE NOT ACTIVE COMPONENTS.

*WHAT IS THE PURPOSE OF THE NOISE WORDS?*

THE PURPOSES OF THE NOISE WORDS ARE TO
FACILITATE THE USAGE OF THE COMMAND AND TO CUE THE
USER ABOUT THE NEXT COMPONENT.

*WHAT IS AN INPUT MODE?*

AN INPUT MODE IS A MODE.
IT IS A WAY TO TYPE THE INPUT STRING.

*TELL ME ABOUT THE RECOGNITION MODE OF THE EXEC.*

THE RECOGNITION MODE OF THE EXEC IS BEST FOR
THE INEXPERIENCED USERS.
THE PROCEDURE IS TO TYPE THE INITIAL STRING OF
THE KEYWORD FOLLOWED BY THE ALTMODE COMMAND, EACH
ARGUMENT FOLLOWED BY THE ALTMODE COMMAND, AND THE
TERMINATOR.
THE RESPONSES ARE EITHER TO PRINT THE WHOLE COMPONENT
AND THE NOISE WORDS IF THE USER TYPED THE
SUFFICIENT CHARACTERS OR TO PRINT THE BELL CHARACTER
TO PROMPT THE USER FOR MORE CHARACTERS.

*WHAT IS THE PURPOSE OF THE ALTMODE COMMAND?*

THE PURPOSE OF THE ALTMODE IS TO ACTIVATE THE
RECOGNITION MODE OF THE EXEC.



*TELL ME MORE ABOUT THE COMPONENTS OF THE COMMANDS OF THE EXEC*

THE FIRST COMPONENT OF THE COMMANDS OF THE EXEC
IS THE KEYWORD.
IT IS THE NAME OF THE COMMAND.
THE DEFAULT IS "RUN".
THE NOISE WORDS ARE INSERTED FOR CLARITY.
THEY ARE ENCLOSED WITHIN PARENTHESES.
THE SUBCOMMANDS ARE OPTIONAL.
THEY ARE NOT ESSENTIAL.
THE STRUCTURE IS THE SEQUENCE OF FIELDS.

*TELL ME MORE ABOUT THE STRUCTURE OF THE COMMANDS.*

THE STRUCTURE OF THE COMMANDS IS THE SEQUENCE
OF THE FIELDS.
EACH FIELD IS EITHER THE KEYWORD OR THE
ARGUMENT FOLLOWED BY THE NOISE WORD.

*WHAT IS THE CALLING PROCEDURE FOR THE EXEC?*

WAIT A MINUTE. I'M CHECKING MY FILES.
OK. LET'S CONTINUE.

THE CALLING PROCEDURE OF THE EXEC IS TO TYPE THE
CTRL-C COMMAND.

*TELL ME ABOUT THE NETWORK CONTROL PROGRAM OF BBN-TENEX.*

THE NETWORK CONTROL PROGRAM OF BBN-TENEX IS
TELNET.



*TELL ME ABOUT TELNET.*

TELNET IS A NETWORK CONTROL PROGRAM.
IT IS A SUBSYSTEM OF TENEX.
THE MAIN PURPOSES ARE TO MAKE THE CONNECTIONS
AND TO PROVIDE THE INFORMATION ABOUT THE ARPANET
TO THE USER.
THE MAIN COMPONENT IS THE COMMAND INTERPRETER.
THE ESCAPE CHARACTER IS CTRL-Z INITIALLY.
THE KINDS OF THE COMMANDS ARE ACCESSING, ECHO
CONTROLLING, AND INFORMING.
THE INPUT MODE IS LOCAL AND REMOTE.

*WHAT IS THE ENTERING PROCEDURE?*

THE ENTERING PROCEDURE IS TO TYPE "TELNET" FOLLOWED
BY THE TERMINATOR.

*WHAT IS THE CONNECTING PROCEDURE?*

THE CONNECTING PROCEDURE IS TO ISSUE THE ACCESSING
COMMANDS.

*TELL ME ABOUT THE ACCESSING COMMANDS.*

THE ACCESSING COMMAND IS THE CONNECTION.TO COMMAND
AND DISCONNECT\COMMAND.

*WHAT IS THE PURPOSE OF THE CONNECTION.TO COMMAND?*

THE PURPOSE OF THE CONNECTION.TO COMMAND IS
TO CONNECT TO THE REMOTE HOST.

*WHAT IS THE PROCEDURE?*

THE PROCEDURE IS TO TYPE THE NAME OF THE HOST
FOLLOWED BY THE TERMINATOR.

*WHAT IS A REMOTE HOST?*

THE REMOTE HOST IS AT THE REMOTE SITE.

*TELL ME ABOUT THE COMMAND INTERPRETER OF TELNET.*

THE PURPOSE OF THE COMMAND INTERPRETER IS
TO HANDLE THE COMMANDS TYPED BY THE USER.
THE MODES ARE FANCY AND QUERY.
THEY ARE FREELY INTERMIXED.



*TELL ME ABOUT THE QUERY MODE.*

THE PURPOSE OF THE QUERY MODE IS TO FIND THE
POSSIBLE CONTINUATION OF THE COMMAND.
THE RESPONSE IS ALL THE POSSIBLE CONTINUATIONS.
THE PROCEDURE IS TO TYPE "?".

*TELL ME ABOUT THE FANCY MODE.*

THE PURPOSE OF THE FANCY MODE IS TO PROCESS ONLY
THE CORRECT CHARACTERS.

*TELL ME MORE ABOUT THE FANCY MODE.*

THE PURPOSE OF THE FANCY MODE IS TO PROCESS ONLY
THE CORRECT CHARACTERS.
THE INCORRECT CHARACTERS ARE NOT ECHOED.
THE ECHO IS THE BELL CHARACTER INSTEAD.
THE COMMAND IS NOT ABORTED.

*WHAT IS THE PURPOSE OF REMOTE MODE?*

I'M AFRAID I CAN'T ANSWER YOUR QUESTION.

*TELL ME ALL YOU KNOW ABOUT R\RTHE REMOTE MODE OF TELNET.*

I DON'T KNOW ENOUGH TO ANSWER YOU. SORRY.

*TELL ME ALL ABOUT THE INPUT MODES OF TELNET.*

THE INPUT MODES OF TELNET ARE LOCAL AND REMOTE.
THE PURPOSE OF THE LOCAL INPUT MODES IS
TO ACTIVATE THE COMMAND INTERPRETER.
IT IS TO ENABLE THE USER TO TALK TO TELNET.
THE ENTERING PROCEDURES ARE EITHER TO ISSUE
LOCAL.MODE\COMMAND OR TO TYPE THE ESCAPE CHARACTER.
THE PURPOSE OF THE REMOTE INPUT MODES IS
TO ENABLE THE USER TO TALK TO THE HOST CONNECTED.
THE ENTERING PROCEDURE IS TO ISSUE EITHER
THE CONNECTION.TO COMMAND OR THE
REMOTE.MODE COMMAND.

Section 5 DESCRIPTION OF NET-SCHOLAR



NET-SCHOLAR is a computer program to answer questions
about the ARPA network. Like TMPNET-SCHOLAR, it is a
SCHOLAR-type system with a data base in the form of a
semantic network. It is different in that it has an ability
to handle verbs and verb relations in understanding the
user's questions and in formulating answers. Verbs are
crucial in this system because most of what a user wants to
know about the network is procedural in nature.

A case grammar representation (described in Section 3)
is used for verbs. Cases, of course, do not have a
one-to-one correspondence to surface-structure placement in
sentences. For instance, in the sentence "The Ctrl-A
command deletes a character", the Ctrl-A command is the
instrument in the deleting, and in the sentence "I can
delete a character with the Ctrl-A command", the Ctrl-A
command is again the instrument, in spite of the fact that
it is the subject in the one sentence and the object of a
preposition in the other.

Some sample pieces of data base are shown in Figure 1.
The DELETE section under CTRL-A\COMMAND gives information
about what the Ctrl-A command deletes, using the standard
cases of AGENT (filled by the noun "user"), INSTRument
(filled by Ctrl-A command), OBJect (last character), and
LOCative (input string). Similarly, the ENTER part of
COMPUTER\SYSTEM tells how to enter a computer system, even
giving a complicated PROCEDURE. (Notice that the procedure,
in its turn, can have verbs, with their cases, embedded
within it.) Purposes, conditions, side-effects, etc., are
also stored in this framework.

In the verb entries (i.e., the DELETE entry itself, not
the DELETE part of CTRL-A\COMMAND), the attribute "CASES" of
the verb tells what kinds of concept nouns fit into the
roles of agent, instrument, etc. This information is very
general and is used internally in the processing of a
question (in particular, in the case assignment) and is not
directly used to print out an answer.

COMPUTER\SYSTEM

(((CN COMPUTER\SYSTEM))
 (SUPERC NIL SYSTEM)
 (SUPERP NIL COMPUTER\CENTER)
 (ENTER (I 2)
    (AGENT NIL USER)
    (INSTR NIL ARPA\NETWORK)
    (OBJ NIL COMPUTER\SYSTEM)
    (PROCEDURE NIL
       ($SEQ (CALL NIL (AGENT NIL USER)
                       (OBJ NIL TELNET))
             (TYPE NIL (AGENT NIL USER)
                       (OBJ NIL (NAME NIL
                                   (OF NIL
                                      COMPUTER\SYSTEM))))
             (LOGIN NIL (AGENT NIL USER)
                        (INSTR NIL LOGIN\COMMAND)
                        (LOC NIL (TO NIL
                                    COMPUTER\SYSTEM))))))
 (EXAMPLES (I 4)
    ($EOR MULTICS BBN-TENEX RAND-RCC SRI-ARC UTAH-10))

CTRL-A\COMMAND

(((XN CTRL-A\COMMAND CTRL-A))
 (SUPERC NIL EDITING\COMMAND)
 (SUPERP NIL EXECUTIVE)
 (PURPOSE NIL (DELETE NIL (AGENT NIL USER)
                          (OBJ NIL (CHARACTER NIL LAST))
                          (INSTR NIL CTRL-A\COMMAND)
                          (LOC NIL INPUT\STRING)))

DELETE

(((VRB DELETE))
 (SUPERC NIL EDIT)
 (CASES (I 6 B)
    (AGENT NIL USER)
    (OBJ NIL DATA FILE JOB)
    (INSTR NIL PROGRAMMING\LANGUAGE PROGRAM
       COMPUTER\SYSTEM JSYS EDITING\COMMAND COMMAND]

ENTER

(((VRB ENTER))
 (CASES (I 6 B)
    (AGENT NIL USER)
    (INSTR NIL COMMAND SUBSYSTEM COMPUTER\NETWORK)
    (OBJ NIL COMPUTER\SYSTEM OPERATING\SYSTEM]

Figure 1

Some Partial Data Base Entries in NET-SCHOLAR

WHAT COMMAND DELETES A CHARACTER
((NP INSTR (WHADJ WHAT) (CN COMMAND))
 (VP (VRB DELETE +S))
 (NP OBJ (DET A) (CN CHARACTER)))

HOW DO I ENTER SRI-ARC
((WHADV HOW)
 (AUX DO)
 (NP AGENT (PRN I))
 (VP (VRB ENTER))
 (NP OBJ (XN SRI-ARC)))

WHERE IS DATA STORED
((WHADV WHERE)
 (AUX BE +S)
 (NP OBJ (CN DATA))
 (VP (VRB STORE +PAST)))

TELL ME ABOUT THE TENEX EXEC CTRL-A COMMAND
((VT (VRB TELL\ME\ABOUT))
 (NP OBJ (DET THE) (XN TENEX) (XN EXECUTIVE) (XN
 CTRL-A\COMMAND)))

WITH WHAT PROGRAM CAN I ACCESS THE NETWORK
((PRP WITH)
 (NP INSTR (WHADJ WHAT) (CN PROGRAM))
 (AUX CAN)
 (NP AGENT (PRN I))
 (VP (VRB ACCESS))
 (NP OBJ (DET THE) (CN COMPUTER\NETWORK)))

Figure 2

Sentences After Parsing and Case Assignment


Net-Scholar's processing of a question is divided into
four parts: parsing, case assignment, retrieval, and
sentence generation.

The first step is parsing. The parser is somewhat
unsophisticated, but it is adequate for the purpose. It
takes the input and builds a tree structure for the
sentence, based on a restricted English grammar. It
currently handles only simple constructions, e.g., no
relative clauses. Noun phrases, though, are allowed to be
somewhat complex, with adjectives, nouns, and prepositional
phrases modifying the noun head. Some examples of parsed
sentences are in Figure 2.

Case assignment takes the parsed sentence, finds the
main verb, and figures out the relation of each noun phrase
to it. The output looks like a parse tree, with the
addition of a case label at the beginning of each noun
phrase (NP) expression. In the first sentence in Figure 2,
"what command" has been labelled as an instrument, and "a
character" is an object.

Case assignment bases its decisions mostly on semantics
and heavily uses the match-on-superc subroutine.
Match-on-superc compares any two concepts and decides
whether they can be the same (e.g., character and
information), or whether the two are distinct (e.g.,
character and computer). It goes up the superconcept chain
(from character to data to information) for each of the two
concepts and sees whether there is an intersection of the
two chains. If there is no intersection, the two concepts
are distinct; if one is directly on the superconcept chain
of the other, then the two concepts coincide. If there is
an intersection further up the chain (character and word
both have the superconcept data), it is more complicated and
further checks must be made. If the subroutine can't prove
that they are different, it concludes that it doesn't know.
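
The three outcomes described for match-on-superc can be
sketched as below. This is a simplified Python illustration
under stated assumptions, not the actual subroutine: the SUPERC
table holds only the chain mentioned in the text (character to
data to information) plus "word", the "further checks" for a
shared ancestor are not modeled and simply yield "unknown", and
the function names are hypothetical.

```python
# Fragment of a superconcept table. The chain character -> data ->
# information comes from the text; everything else is an assumption.
SUPERC = {
    "character": "data",
    "word": "data",
    "data": "information",
}


def chain(concept):
    """Return the concept followed by all of its superconcepts."""
    result = [concept]
    while result[-1] in SUPERC:
        result.append(SUPERC[result[-1]])
    return result


def match_on_superc(a, b):
    """Classify two concepts as 'same', 'distinct', or 'unknown'."""
    chain_a, chain_b = chain(a), chain(b)
    if a in chain_b or b in chain_a:
        return "same"      # one lies directly on the other's chain
    if set(chain_a) & set(chain_b):
        return "unknown"   # shared ancestor: further checks not modeled
    return "distinct"      # the two chains never intersect


print(match_on_superc("character", "information"))
print(match_on_superc("character", "computer"))
print(match_on_superc("character", "word"))
```

With this table the three calls give "same", "distinct", and
"unknown" respectively, mirroring the three situations in the
paragraph above.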

In assigning cases, a match-on-superc is tried between
the noun head (the noun which the NP is about) in each NP of
the sentence, and each noun in each case under the CASES of
the verb in the data base. If there is a match (e.g.,
between "character" in the sentence and "data" in the OBJ
case under DELETE), the case assignment routine takes note of
the case (OBJ) and the word that matched (data) and
continues on to try the others. A weight is also assigned
based on the goodness of the match. For instance,
"character" would match "character" perfectly, but a match
with "data" is slightly less good, since characters are data
but so are a lot of other things.

In addition to match-on-superc, case assignment uses
syntactic clues, such as the presence of certain
prepositions ("with" for an instrument) or the noun's
position as surface object or surface subject of the
sentence. It also uses match-on-superp (match on superpart,
similar to match-on-superc), on locative and instrumental
cases only. Eventually, all the semantic and syntactic
possibilities are considered, their weights are compared,
and the best case assignment is selected.
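
A weighted selection of this kind might look like the following
Python sketch. The report does not give the actual weights or
search strategy, so the numbers here (1/(1+d) for a match d
steps up the superconcept chain, plus 0.5 when "with" marks an
instrument) and the exhaustive comparison of combinations are
assumptions made purely for illustration. The CASES entry is
abbreviated from the DELETE entry of Figure 1, and the link
from editing command to command is likewise assumed.

```python
from itertools import product

# Abbreviated, partly assumed data base fragments.
SUPERC = {"character": "data",
          "ctrl-a command": "editing command",
          "editing command": "command"}
CASES = {"delete": {"AGENT": ["user"],
                    "OBJ": ["data", "file", "job"],
                    "INSTR": ["program", "command"]}}


def distance(noun, target):
    """Steps up the superconcept chain from noun to target, else None."""
    steps = 0
    while noun != target:
        if noun not in SUPERC:
            return None
        noun = SUPERC[noun]
        steps += 1
    return steps


def candidates(verb, head, prep=None):
    """Map each case the noun head could fill to its best weight."""
    scores = {}
    for case, nouns in CASES[verb].items():
        for target in nouns:
            d = distance(head, target)
            if d is None:
                continue
            weight = 1.0 / (1 + d)       # assumed: exact match scores 1.0
            if prep == "with" and case == "INSTR":
                weight += 0.5            # assumed bonus for the syntactic clue
            scores[case] = max(scores.get(case, 0.0), weight)
    return scores


def assign_cases(verb, phrases):
    """phrases: list of (noun head, preposition or None), in sentence order."""
    options = [list(candidates(verb, head, prep).items())
               for head, prep in phrases]
    best, best_score = None, -1.0
    for combo in product(*options):
        cases = [case for case, _ in combo]
        if len(set(cases)) < len(cases):
            continue                     # each case is used at most once
        score = sum(weight for _, weight in combo)
        if score > best_score:
            best, best_score = cases, score
    return best


print(assign_cases("delete", [("command", None), ("character", None)]))
print(assign_cases("delete", [("user", None), ("character", None),
                              ("ctrl-a command", "with")]))
```

The first call corresponds to "What command deletes a
character" and the second to "I can delete a character with the
Ctrl-A command", with the pronoun already resolved to "user".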

Now the sentence is ready for the retrieval process.
For retrieval, the problem in a question like "What command
deletes a character" is to find information somewhere in the
data base that answers the question. The retrieval routine
searches under each noun in the sentence ("command" and
"character"), under the things that matched in the case
assignment, examples of those things, and various other
likely places, until it finds what it wants. At each place,
it looks for the appropriate verb ("delete") and, if it
finds it, matches (with match-on-superc and match-on-superp)
the case nouns stored there against the case nouns in the
sentence. For example (Figure 1), if it finds "delete"
under "Ctrl-A\command", it matches "Ctrl-A command" against
"command" and "character" against "character". Here the
match is good, and Net-Scholar answers as shown in the first
example in Figure 3. If the match had been bad (e.g., if
the information had been about deleting a word instead of
about deleting a character), then the search would have
continued.

*WHAT COMMAND DELETES A CHARACTER*



THE CTRL-A COMMAND DELETES THE LAST CHARACTER IN
AN INPUT STRING.

*HOW DO I ENTER SRI-ARC*

THE USER ENTERS A COMPUTER SYSTEM USING THE ARPA NETWORK.
PROCEDURE: FIRST HE CALLS TELNET;
THEN HE TYPES A NAME OF A COMPUTER SYSTEM;
THEN HE LOGS IN TO A COMPUTER SYSTEM USING THE LOGIN
COMMAND.

*WHAT DOES TELNET DO*

TELNET PROVIDES TO A USER INFORMATION ABOUT THE ARPA
NETWORK;
OR IT MAKES CONNECTIONS WITHIN THE ARPA NETWORK.

*CAN I ENTER BBN-TENEX WITH CTRL-C*

THE USER ENTERS BBN-TENEX USING THE CTRL-C COMMAND.
PROCEDURE: HE TYPES A CTRL AND A C SIMULTANEOUSLY.

*HOW DO YOU TRANSFER A FILE OVER THE NETWORK*

FTP TRANSFERS FILES OVER THE ARPA NETWORK.

*WHERE IS THE NETWORK INFORMATION CENTER*

THE NETWORK INFORMATION CENTER IS AT SRI-ARC.

*WHAT IS THE PURPOSE OF A COMPUTER SYSTEM*

A COMPUTER SYSTEM PROCESS DATA AND PROGRAMS
AUTOMATICALLY.

*WHERE IS DATA STORED*

THE USER STORES PROGRAM OR DATA IN FILES.

Figure 3

Actual Questions With Answers by NET-SCHOLAR

That is the basic procedure, but there are a lot of
special things to handle. For instance, in the question
"What command deletes a character", I don't want to be told
"A command deletes a character", but rather "The Ctrl-A
command deletes a character". That is, "command" is not an
adequate match for "what command" and something more
specific is needed. (This is a peculiar property of WH
words, like "what".) Each question type has its own
idiosyncrasies about the way it wants retrieval to handle
it. For example, a "how" question is asking for an
instrument or a procedure, and a "tell me about" question
just wants a slice of data base information about something.

Retrieval, of course, also has the task of evaluating
complex noun phrases. This may involve the straightforward
searching for an attribute under an object, or the applying
of any of a number of inferences.

When the information to answer a question has been
found, all that remains is for the sentence-generation
routine to put it into sentence form and print it out.
Basically, the routine finds the primary verb, orders the
cases for that verb, adjusts the subject and verb to be
singular or plural, and puts in the necessary articles,
prepositions, etc. When the piece of information is complex
and embedded, several sentences may be made, as in the
second example in Figure 3.



(CTRL-A\COMMAND NIL (DELETE NIL
                       (OBJ NIL (CHARACTER NIL LAST))
                       (INSTR NIL CTRL-A\COMMAND)
                       (LOC NIL INPUT\STRING]

"The CTRL-A command deletes the last character in an input
string."

Figure 4

Example of Input and Output of Sentence Generation



In Figure 4, there is a sample piece of information and
the sentence produced from it. DELETE is a regular verb in
the cases it takes, and the elements present are ordered:
INSTR + VERB + OBJ + LOC. If an AGENT had also been
present, a different order would have been used. To the
ordered list of elements, articles are added and modifiers
are placed, as in "the last character"; prepositions are
added, "in an input string"; the verb is made to agree,
"deletes"; and finally the sentence is printed: "The Ctrl-A
command deletes the last character in an input string."
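
The ordering and filling-in steps for this one example can be
sketched as follows. The Python below is an illustration only:
it handles just the INSTR + VERB + OBJ + LOC order of Figure 4,
forms the singular verb by adding "s", and picks articles
("the" for the subject and object, "a"/"an" for the locative)
by rules inferred from this single sentence. All function names
and the dictionary layout are hypothetical.

```python
def noun_phrase(head, modifier=None, article="the"):
    """Place the article and an optional modifier before the noun head."""
    words = [article, modifier, head]
    return " ".join(w for w in words if w)


def indefinite(head):
    """Choose 'a' or 'an' from the first letter (a rough rule)."""
    return "an" if head[0].lower() in "aeiou" else "a"


def generate(verb, info):
    """Build a sentence in the order INSTR + VERB + OBJ + LOC."""
    instr, obj, loc = info["INSTR"], info["OBJ"], info["LOC"]
    parts = [
        noun_phrase(instr["head"]),
        verb + "s",  # singular subject, regular verb assumed
        noun_phrase(obj["head"], obj.get("modifier")),
        "in " + noun_phrase(loc["head"], article=indefinite(loc["head"])),
    ]
    sentence = " ".join(parts) + "."
    return sentence[0].upper() + sentence[1:]


# The piece of information shown in Figure 4, as a Python dictionary.
info = {"INSTR": {"head": "CTRL-A command"},
        "OBJ": {"head": "character", "modifier": "last"},
        "LOC": {"head": "input string"}}
print(generate("delete", info))
```

A fuller version would need a separate ordering for entries
that include an AGENT, as the text notes.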
Section 6 SUMMARY AND CONCLUSIONS



6.1 Summary. 

We have constructed a SCHOLAR system, NET-SCHOLAR, 
capable of dealing with functional and procedural 
information within the context of the ARPA network. 
Previous SCHOLAR systems (referred to as GEO-SCHOLAR) dealt 
only with static information, within the context of the 
geography of South America. 

NET-SCHOLAR incorporates several basic changes with
respect to GEO-SCHOLAR, particularly in terms of handling
routines and semantic network representations.

The most important class of new elements incorporated
into the semantic network is action verbs. Following the
work of Fillmore, our representation of the meaning of verbs
is based on Case structures: a characterization of the
noun phrases that "agree" with the action verb under various
Case relationships (such as Agent, Instrument, Object, etc.)
in the formation of meaningful and relevant simple
sentences. An example would be "People (as Agents) activate
programs (as Objects) with program activation commands (as
Instruments)". Particular purposes or actions related to
the various topics or objects represented in the data base
incorporate instantiations of the nouns characterized
globally in the Case structure (e.g., "Users call the EXEC
with CTRL-C"). Procedural information is incorporated as a
means to achieve a purpose or action. Sometimes the
specification of an Instrument is enough to teach a user how
to perform a certain action (e.g., call the EXEC). In most
instances, however, another action or sequence of actions
must be specified. The following dialogue is
self-explanatory:

-How do I call the EXEC?

-You call the EXEC with CTRL-C

-How?

-By typing CTRL-C

-How?

-By depressing the CTRL key and the C key
simultaneously.

In terms of handling routines, the most important new
features of NET-SCHOLAR are the English Comprehender and
the retrieval package.

The English Comprehender is designed to understand
action sentences having a simple structure, that is,
sentences incorporating a main action verb but no
subordinate sentences. We say that an action sentence is
understood by NET-SCHOLAR when all its noun phrases have
been assigned Case relationships with respect to the main
verb. An example is "Users (as Agents) call the EXEC (as
Object) with CTRL-C (as Instrument)." The main new feature
of the NET-SCHOLAR retrieval package is its ability to find
a particular instantiation of an action verb in the data
base that matches a certain set of Case relationships.

An important class of questions that the new routines
are designed to answer can be conceived as sentences that
point out a missing Case relationship with respect to the
main verb. In this case, the retrieval routines of
NET-SCHOLAR are designed to find an occurrence in the data
base of the verbs such questions use, for which the nouns
assigned to the various Cases by the English Comprehender
are acceptable under such an occurrence. It all happens as
if, for example, the question "How do I call the EXEC" were
paraphrased as "Find an occurrence of the verb 'call' in the
data base, for which 'I' is an acceptable Agent and 'EXEC'
an acceptable Object, and give me the Instrument."
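
The paraphrase above suggests a simple search, sketched here in
Python. It is a minimal illustration, not the retrieval package
itself: the flat OCCURRENCES list stands in for verb
occurrences scattered through the semantic network, the
acceptability test is a bare walk up a superconcept table, and
the assumption that 'I' counts as a user is introduced only for
the example. All names are hypothetical.

```python
# Hypothetical stand-in for verb occurrences stored in the data base.
OCCURRENCES = [
    {"verb": "delete", "AGENT": "user", "OBJ": "character", "INSTR": "CTRL-A"},
    {"verb": "call", "AGENT": "user", "OBJ": "EXEC", "INSTR": "CTRL-C"},
]
SUPERC = {"I": "user"}  # assumption: the questioner counts as a user


def acceptable(noun, stored):
    """True if noun equals the stored noun or has it as a superconcept."""
    while noun != stored:
        if noun not in SUPERC:
            return False
        noun = SUPERC[noun]
    return True


def find_missing_case(verb, known, wanted):
    """Return the filler of the wanted case in the first fitting occurrence."""
    for occurrence in OCCURRENCES:
        if occurrence["verb"] != verb or wanted not in occurrence:
            continue
        if all(case in occurrence and acceptable(noun, occurrence[case])
               for case, noun in known.items()):
            return occurrence[wanted]
    return None  # nothing in the data base fits the question


# "How do I call the EXEC": Agent and Object known, Instrument wanted.
print(find_missing_case("call", {"AGENT": "I", "OBJ": "EXEC"}, "INSTR"))
```

The call prints CTRL-C, in line with "Users call the EXEC with
CTRL-C" from the summary.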

6.2 Conclusions 

The present NET-SCHOLAR satisfies the design
requirements outlined above, and to that extent it has been
successful. Much remains to be done, however, to make
systems of the NET-SCHOLAR type more useful to people. In
the following we outline some of the areas in which we feel
more work is necessary.

1) Further work in English comprehension is necessary
   to enable the system to understand hypotheses,
   causal explanations, and descriptions of situations
   as offered by users; unlike factual questions,
   these often involve complex sentence structures
   that are beyond the powers of the present system.

   Also, the system would have a much more
   natural feel for its users if it were able to deal
   effectively with anaphoric references (pronoun
   references, for example) and with paraphrases
   (different ways of saying the same thing). The
   present NET-SCHOLAR is almost powerless in this
   regard.

2) In NET-SCHOLAR we have represented the meaning of
   actions and procedures by describing them in terms
   of other actions and procedures. This conceptual
   representation carries implicit with it the
   existence of an external "understander" that is
   capable of carrying out and executing the actions
   described in the procedural information. What we
   would like to add is a representation of actions as
   simulation programs, so that we may obtain the
   consequences of such actions by executing the
   simulation and examining the resultant computation.
   In this manner, we could avoid deducing the distant
   side effects of an action, which is often very
   difficult, by actually simulating the action and
   observing the result.



3) The NET-SCHOLAR system as it now exists is only able
   to answer questions. To make NET-SCHOLAR a really
   effective CAI system it must be given the ability
   to present information and ask questions. Since
   sophisticated tutorial facilities now exist in
   GEO-SCHOLAR, similar facilities should be
   incorporated in NET-SCHOLAR. These facilities
   should be augmented with tutorial strategies that
   are well-suited to teaching procedural knowledge,
   such as teaching by instantiation and demonstration
   (the teacher presents examples and works them out),
   and teaching by doing (asking the student to
   perform a task that encompasses the procedural
   knowledge being taught).

In our follow-on work, we will develop a system to help
naive users of the NLS (On Line System) and we shall tackle
some of these problems.